Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filter out unconnected pseudo nodes just once, at the end #2951

Merged
merged 7 commits into from
Nov 28, 2017

Conversation

rade
Copy link
Member

@rade rade commented Nov 24, 2017

The current rendering pipeline looks like this:

render . filterUnconnectedPseudo . memoise . filter(options)

We render a report, filter unconnected pseudo nodes, memoise the result, and apply user-specified filters.

Some of the steps are composites. filterUnConnectedPseudo nodes colours connected nodes in one step and then applies an ordinary filter that drops uncoloured nodes:

filterUnconnectedPseudo = colorConnected . filter(unconnectedPseudo)

And filtering always filters unconnected pseudo nodes:

filter(x) = filter'(x) . filterU

where filter'(x) is the "raw" filter operation and filterU is the "filter unconnected pseudo nodes" step.

So the current pipeline actually looks like this:

render . colorConnected . filter'(unconnectedPseudo) . filterU . memoise . filter'(options) . filterU

There are couple of problems with this. Firstly, we filter unconnected pseudo nodes three times, which is inefficient. Secondly, colorConnected marking is expensive since it updates nodes' LatestMap.

Thirdly, while the filtering done by colorConnected . filter'(unconnectedPseudo) drops nodes which are only connected to themselves, filterU fails to do that. Hence end result may contain such nodes because the colorConnected step and subsequent filtering occurs prior to applying the user-specified filters.

The changes in this PR yield a pipeline like this:

render . memoise . filter'(options) . filter'(unconnectedPseudo')

where unconnectedPseudo' is a filter predictate that is dynamically constructed from a calculation of a connected nodes, that correctly excludes self-connections.

One slight complication is that for the process topology we want to filter all unconnected nodes, not just unconnected pseudo nodes. We do this by calculating connected nodes differently in that case.

There is one wart: we cannot get rid of ColorConnected completely. We still need it in ColorConnectedProcessRenderer, which uses it to mark nodes; the marks are required by detailed.processNodeSummary to determine whether a process in the details panel can be rendered as a link.

@rade rade requested a review from bboreham November 24, 2017 10:29
@rade
Copy link
Member Author

rade commented Nov 24, 2017

There's no significant improvement in performance, which is ok since correctness and saner structure alone are sufficient motivators for this PR.

Best result of three of running go test -tags 'unsafe' -run=x -cpu=2 -bench=Topology -bench-report-file=/tmp/prod-on-prod-report.json
master

BenchmarkTopologyList-2         	       2	 696855104 ns/op	97747168 B/op	  918009 allocs/op
BenchmarkTopologyHosts-2        	       2	 553922460 ns/op	72166384 B/op	  734743 allocs/op
BenchmarkTopologyContainers-2   	       5	 269104905 ns/op	30147406 B/op	  303597 allocs/op

branch

BenchmarkTopologyList-2         	       2	 670668899 ns/op	94420696 B/op	  908547 allocs/op
BenchmarkTopologyHosts-2        	       2	 511880816 ns/op	71913592 B/op	  733575 allocs/op
BenchmarkTopologyContainers-2   	       5	 253800350 ns/op	29085857 B/op	  302659 allocs/op

And time curl -s "http://localhost:4040/api/topology/<topology>...

master branch
processes 0.236 0.240
containers 0.706 0.679
pods 0.464 0.566
services 0.549 0.484
hosts 0.690 0.652

@rade
Copy link
Member Author

rade commented Nov 24, 2017

I've checked the correctness by verifying that the output of

echo > report.json
for r in processes containers pods hosts \
         "containers?pseudo=hide&namespace=default,scope&system=application&stopped=running" \
         "pods?pseudo=hide&namespace=default,scope" ; do \
    echo $r
    curl -s "http://localhost:4040/api/topology/$r" | jq -S '.' >> report.json; \
done

is identical between master and branch.

Copy link
Collaborator

@bboreham bboreham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this PR very hard to follow. Maybe that is unavoidable.

func isNotBar(node report.Node) bool {
return node.ID != "bar"
}
var isNotBar = render.Transformers([]render.Transformer{

This comment was marked as abuse.

This comment was marked as abuse.

@@ -164,14 +164,14 @@ func getTestContainerLabelFilterTopologySummary(t *testing.T, exclude bool) (det

var (
topologyRegistry = app.MakeRegistry()
filter render.FilterFunc
filterFunc render.FilterFunc

This comment was marked as abuse.

This comment was marked as abuse.

nodes.Nodes[nodeID] = node
nodes.Filtered--
}
nodes = transformer.Transform(nodes)

This comment was marked as abuse.

This comment was marked as abuse.

@@ -553,9 +553,9 @@ func (r *Registry) RendererForTopology(topologyID string, values url.Values, rpt
}
}
if len(filters) > 0 {
return topology.renderer, render.ComposeFilterFuncs(filters...), nil
return topology.renderer, render.Transformers([]render.Transformer{render.ComposeFilterFuncs(filters...), topology.filter}), nil

This comment was marked as abuse.

This comment was marked as abuse.

@rade rade force-pushed the one-pass-unconnected branch from cbfc7af to 2b5bc80 Compare November 27, 2017 17:39
@rade
Copy link
Member Author

rade commented Nov 27, 2017

@bboreham Addressed comments. PTAL

rade added 7 commits November 28, 2017 06:54
Instead of three passes
1. building a 'connected' node set
2. marking nodes from that set with a LatestMap entry
3. removing unmarked nodes

we just do two
1. building a 'connected' node set
2. removing nodes not in that set

This does entail duplication of the adjecency list pruning code from
FilterFunc.Apply. We will be able to eliminate that eventually, but
not just yet.

Also, we cannot get rid of ColorConnected completely;
ColorConnectedProcessRenderer uses it to mark nodes, and that
information is required by detailed.processNodeSummary to determine
whether a process in the details panel can be rendered as a link.
so we can generalise the filter step in render.Render et al. That will
allow us to apply whole-topology filters in that step.
This is a step towards filtering unconnected nodes after all custom
filters have been applied.
...rather than before. That way, nodes which become unconnected during
filtering are removed, which is what we want. ATM we are depending on
some 'unconnected' filtering inside every filter, which is expensive
and largely redundant. We should soon be able to remove that.

downside: 'unconnected' filtering is no longer memoised.
It's now done via a special filter, once, after all other filters have
been applied.

Some tests need updating since they were relying on ordinary filters
doing that filtering.
This eliminates the awkward distinction between ProcessRenderer and
ColorConnectedProcessRenderer.

It also ensures that processes resulting from direct rendering of the
process topology (/api/topology/processes is invoking
ProcessWithContainerNameRenderer and /api/topology/processes-by-name
is invoking ProcessNameRenderer) are colored and hence summarising
them correctly sets the 'linkable' property. This was the behaviour
prior to the revamping of the rendering pipeline. However, it doesn't
actually make a practical difference since process detail panels only
show other processes as connection endpoints, and these are always
marked linkable anyway.
@rade rade force-pushed the one-pass-unconnected branch from 2b5bc80 to 9563036 Compare November 28, 2017 07:36
@rade rade merged commit e73ba58 into master Nov 28, 2017
@rade rade deleted the one-pass-unconnected branch December 25, 2017 10:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants